Lu Yin

One LR Doesn't Fit All: Heavy-Tail Guided Layerwise Learning Rates for LLMs

May 21, 2026

MedFM-Robust: Benchmarking Robustness of Medical Foundation Models

May 21, 2026

ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity

May 05, 2026

TSegAgent: Zero-Shot Tooth Segmentation via Geometry-Aware Vision-Language Agents

Mar 20, 2026

W2T: LoRA Weights Already Know What They Can Do

Mar 16, 2026

A Survey of Weight Space Learning: Understanding, Representation, and Generation

Mar 10, 2026

Progressive Residual Warmup for Language Model Pretraining

Mar 05, 2026

Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?

Feb 26, 2026

Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models

Feb 11, 2026

Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization

Feb 10, 2026